-
Notifications
You must be signed in to change notification settings - Fork 3
Add masked table column literal comperison masking rewrite #418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… implemented [RUN CI]
# If the literal is a list or tuple, convert each element | ||
# individually and create an array literal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because now, with the recent changes, a "list literal" can contain non-literal expressions, e.g. a list of function calls [MASK("foo"), MASK("bar")]
…/masked_relational_rewrite
"server masked": true, | ||
"unprotect protocol": "PTY_UNPROTECT({}, 'deName')", | ||
"protect protocol": "PTY_PROTECT({}, 'deName)", | ||
"protect protocol": "PTY_PROTECT({}, 'deName')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed earlier, cuases a bug in this PR because protect protocol
isn't actually used until now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job Kian!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I just have some questions
- `UNMASK(x) == literal` -> `x == MASK(literal)` | ||
- `literal == UNMASK(x)` -> `MASK(literal) == x` | ||
- `UNMASK(x) != literal` -> `x != MASK(literal)` | ||
- `literal != UNMASK(x)` -> `MASK(literal) != x` | ||
- `UNMASK(x) IN (literal1, ..., literalN)` -> `x IN (MASK(literal1), ..., MASK(literalN))` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Questions:
1- What would happen if the literal value is None
?
2- Is it possible to have nested calls? UNMASK(UNMASK(x)) == literal
. If yes, then how would that be handled?
3- Why only equality/inequality? What about other comparison operations (>, >=, <, <=)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if the literal value is None ?
Same as any other literal: UNMASK(x) == None
-> x == UNMASK(None)
. However, this shouldn't be done in practice since ANYTHING == None
always returns NULL.
Is it possible to have nested calls? UNMASK(UNMASK(x)) == literal
Shouldn't be, since unmasking is only done to convert the scanned table data (which is protected) to its unprotected form. If such a pattern occurs, then that means a bug has happened somewhere.
Why only equality/inequality? What about other comparison operations (>, >=, <, <=)?
Because masking/unmasking preserves equality/inequality, which isn't true for others. For example:
UNMASK(column) == 16
is equivalent tocolumn == MASK(16)
UNMASK(column) < 16
is NOT the same ascolumn < MASK(16)
, since the mask/unmask does not preserve ordinality of values
Followup to #417, performs the following rewrites on relational algebra expressions involving UNMASK operations:
UNMASK(expr) == literal
->expr == MASK(literal)
UNMASK(expr) != literal
->expr != MASK(literal)
ISIN(UNMASK(expr), [literal1, literal2, ...])
->ISIN(expr, [MASK(literal1), MASK(literal2), ...])
These rewrites are done through an additional shuttle run during relational simplification, which is only activated if the environment variable
PYDOUGH_ENABLE_MASK_REWRITES
is set to 1.